Ray Serve readiness checks only proxy#4480

Open
spencer-p wants to merge 3 commits into ray-project:master from spencer-p:unified-health

spencer-p (Contributor) commented Feb 4, 2026

Why are these changes needed?

The RayService readiness check uses wget to fetch both the raylet health and the proxy actor status, and it's unclear whether both are needed. If we can check only the raylet for liveness and only the proxy for readiness, then we can completely remove the dependency on wget for Ray 2.53 and later.
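
As a rough illustration of the split described above (raylet for liveness, proxy for readiness), here is a minimal Go sketch. The `Probe`, `HTTPGetAction`, and `ExecAction` types are simplified stand-ins for the Kubernetes `corev1` types so that the example compiles on its own, and the paths and ports are assumptions for illustration that should be checked against the KubeRay constants rather than taken as the values this PR uses.

```go
package main

import "fmt"

// Simplified stand-ins for the Kubernetes corev1 probe types, defined locally
// so the sketch compiles without external modules.
type HTTPGetAction struct {
	Path string
	Port int
}

type ExecAction struct {
	Command []string
}

type Probe struct {
	Exec    *ExecAction
	HTTPGet *HTTPGetAction
}

// Assumed endpoint values, for illustration only.
const (
	rayletHealthPath = "/api/local_raylet_healthz" // assumed raylet health path
	proxyHealthPath  = "/-/healthz"                // assumed Serve proxy health path
	agentPort        = 52365                       // assumed dashboard agent port
	servePort        = 8000                        // assumed Serve port
)

// livenessProbe checks only the raylet, using a single HTTP GET.
func livenessProbe() Probe {
	return Probe{HTTPGet: &HTTPGetAction{Path: rayletHealthPath, Port: agentPort}}
}

// readinessProbe checks only the Serve proxy, using a single HTTP GET.
func readinessProbe() Probe {
	return Probe{HTTPGet: &HTTPGetAction{Path: proxyHealthPath, Port: servePort}}
}

func main() {
	fmt.Printf("liveness:  %+v\n", *livenessProbe().HTTPGet)
	fmt.Printf("readiness: %+v\n", *readinessProbe().HTTPGet)
}
```

Because each probe targets exactly one endpoint, neither needs a shell command inside the container, which is what would make wget unnecessary for these checks.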

Related issue number

Follow up to #4448.

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(
Cursor Bugbot reviewed your changes and found no issues for commit 96d27be

Signed-off-by: Spencer Peterson <spencerjp@google.com>
400Ping (Contributor) commented Feb 5, 2026

Since the slim images (without wget) are only being released for Ray 2.53+, do older versions need to keep using wget?

Copilot AI mentioned this pull request Feb 5, 2026
4 tasks
Co-authored-by: Jun-Hao Wan <ken89@kimo.com>
Signed-off-by: Spencer Peterson <spencerjp@google.com>
@spencer-p (Contributor, Author)

> does older versions need to keep using wget?

Older versions need to use the exec probe in cases where there are multiple endpoints to check, which is the case for all the liveness probes and all the non-serving readiness probes.

I suppose if we agree that the Serve readiness check only needs to hit the proxy endpoint, then on pre-2.53 versions we should be able to drop wget for this probe only.
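
The rule in this comment can be sketched as a small decision function. This is only an illustration under stated assumptions: `probeKind`, `hitsMultipleEndpoints`, and `needsExecWget` are hypothetical names, not identifiers from the KubeRay codebase, and the sketch assumes that Ray 2.53+ needs no wget for any probe, as the PR description suggests.

```go
package main

import "fmt"

// probeKind is a hypothetical label for the probes discussed in this thread.
type probeKind int

const (
	liveness probeKind = iota
	nonServeReadiness
	serveReadiness
)

// hitsMultipleEndpoints reports whether a probe has to query more than one
// endpoint. It assumes the proposal is accepted, so the Serve readiness probe
// queries the proxy endpoint only.
func hitsMultipleEndpoints(k probeKind) bool {
	return k != serveReadiness
}

// needsExecWget reports whether a probe still needs the wget-based exec
// command: only on pre-2.53 Ray versions, and only when the probe covers
// multiple endpoints, since an HTTP GET probe targets a single endpoint.
func needsExecWget(preRay253 bool, k probeKind) bool {
	return preRay253 && hitsMultipleEndpoints(k)
}

func main() {
	names := map[probeKind]string{
		liveness:          "liveness",
		nonServeReadiness: "non-serve readiness",
		serveReadiness:    "serve readiness",
	}
	for _, k := range []probeKind{liveness, nonServeReadiness, serveReadiness} {
		fmt.Printf("%-20s pre-2.53: wget=%v  2.53+: wget=%v\n",
			names[k], needsExecWget(true, k), needsExecWget(false, k))
	}
}
```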

if httpHealthCheck {
	rayContainer.ReadinessProbe.Exec = nil
	rayContainer.ReadinessProbe.HTTPGet = &corev1.HTTPGetAction{
		Path: utils.RayServeProxyHealthPath,
Member

I think this makes sense, but just calling out the assumption that the Serve proxy health check likely depends on the raylet health already.

@rueian @Future-Outlier does it make sense to you?

Member

Will there ever be a case where serve proxy health check passes but raylet probe fails? Seems unlikely to me but maybe I'm missing something

Signed-off-by: Spencer Peterson <spencerjp@google.com>